Calculating Statistical Similarity between Sentences

نویسندگان

  • Junsheng Zhang
  • Yunchuan Sun
  • Huilin Wang
  • Yanqing He
چکیده

Sentence similarity plays an important role in text-related research and applications. It is closely related to word similarity and document similarity. The statistical similarity measures between sentences, based on symbolic characteristics and structural information, could measure the similarity between sentences without any prior knowledge but only on the statistical information of sentences. This paper presents several approaches to calculating statistical similarity between sentences on a test corpus of 40 sentences. These measures can be used in short text related applications such as corpus construction and title/abstract based document recommendation. The evaluation results show the differences of these measures.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

A Sentence Semantic Similarity Calculating Method Based on Segmented Semantic Comparison

In order to calculate sentence semantic similarity more accurately, a sentence semantic similarity calculating method based on segmented semantic comparison was proposed. Sentences would be divided into the trunk and the other segments by some grammar rules, and each segment might be divided into several shorter segments. When calculating the sentence semantic similarity between two sentences, ...

متن کامل

Calculating the similarity between words and sentences using a lexical database and corpus statistics

Calculating the semantic similarity between sentences is a long dealt problem in the area of natural language processing. The semantic analysis field has a crucial role to play in the research related to the text analytics. The semantic similarity differs as the domain of operation differs. In this paper, we present a methodology which deals with this issue by incorporating semantic similarity ...

متن کامل

Measuring the sentence level similarity

This article describes a method used to calculate the similarity between short English texts, specifically of sentence length. The described algorithm calculates semantic and word order similarities of two sentences. In order to do so, it uses a structured lexical knowledge base and statistical information from a corpus. The described method works well in determining sentence similarity for mos...

متن کامل

DERI&UPM: Pushing Corpus Based Relatedness to Similarity: Shared Task System Description

In this paper, we describe our system submitted for the semantic textual similarity (STS) task at SemEval 2012. We implemented two approaches to calculate the degree of similarity between two sentences. First approach combines corpus-based semantic relatedness measure over the whole sentence with the knowledge-based semantic similarity scores obtained for the words falling under the same syntac...

متن کامل

Calculating Semantic Similarity between Academic Articles using Topic Event and Ontology

Determining semantic similarity between academic documents is crucial to many tasks such as plagiarism detection, automatic technical survey and semantic search. Current studies mostly focus on semantic similarity between concepts, sentences and short text fragments. However, document-level semantic matching is still based on statistical information in surface level, neglecting article structur...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2011